fix(inference): AttributeError in streaming response cleanup #4236
mattf merged 12 commits into llamastack:main
Conversation
mattf
left a comment
@r-bit-rry a chain of hasattr like this suggests we've done something wrong in the design. have we or can we just call close?
It really comes down to what we want to support. Since this was never strictly typed, I'm assuming there are other objects that can be generated by the sse_generator. And on a more serious note @mattf, it's our decision whether we want to enforce certain typings and act on them, or let this pattern "catch all".
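For reference, a minimal sketch of the kind of `hasattr` chain under discussion (illustrative only, not the actual llama-stack cleanup code):

```python
# Illustrative sketch, not the shipped code: a "catch all" cleanup that
# probes for whichever close method the generated object happens to have.
async def close_event_gen(event_gen) -> None:
    if hasattr(event_gen, "aclose"):
        await event_gen.aclose()  # async generators / AsyncIterators
    elif hasattr(event_gen, "close"):
        await event_gen.close()   # openai.AsyncStream: close() is async
```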
mattf
left a comment
as proposed, the hasattr chain will cover up an api contract bug somewhere in the system.
an AsyncStream is making it to a place where only AsyncIterators should be.
i did a little sleuthing and i think there is a bug in at least _maybe_overwrite_id. there are multiple provider impls, so there may be others.
will you find the places where the api contract is being violated and patch them?
also, will you create a regression test that at least tests the openai mixin provider?
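A rough sketch of what such a regression test could assert; the fixture and call shape here are assumptions, not the test that actually landed in the PR:

```python
import inspect

import pytest


@pytest.mark.asyncio
async def test_openai_mixin_streaming_returns_async_iterator(openai_mixin_adapter):
    # `openai_mixin_adapter` is a hypothetical fixture wired to a stubbed
    # OpenAI client; the real test may be structured differently.
    response = await openai_mixin_adapter.openai_completion(
        model="test-model", prompt="hello", stream=True
    )
    # The api contract expects an AsyncIterator; async generators satisfy
    # it and expose aclose() for cleanup, unlike a raw openai.AsyncStream.
    assert inspect.isasyncgen(response)
    await response.aclose()
```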
@mattf sure thing, I'll start working on those
@mattf We're facing two options to avoid the hasattr chain, as I see it. Option 1: wrap the stream and re-yield its chunks; this is explicit and simple but carries a small overhead per chunk. Option 2: adapter pattern, direct delegation with no re-yielding and a more explicit intent (rough sketches of both follow the list below). Regarding locations of violations where we will need patching, these are the places I was able to spot:
* returned AsyncStream (has close()) instead of AsyncIterator (has aclose())
* returned raw client.chat.completions.create() response
* returned raw client.completions.create() response
* returned raw litellm.acompletion() result
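For concreteness, rough sketches of the two options (illustrative, assuming the openai SDK's AsyncStream; the wrap_async_stream helper mentioned later in this thread may differ in detail):

```python
from collections.abc import AsyncIterator


# Option 1: wrap and re-yield. Explicit and simple, but each chunk is
# re-yielded through an extra generator frame. The wrapper is a real
# async generator, so callers get aclose() for free, and the finally
# block translates cleanup into AsyncStream's async close().
async def wrap_async_stream(stream) -> AsyncIterator:
    try:
        async for chunk in stream:
            yield chunk
    finally:
        await stream.close()


# Option 2: adapter. Direct delegation with no re-yielding; the intent
# (bridging close() to aclose()) is spelled out in one place.
class AsyncStreamAdapter:
    def __init__(self, stream):
        self._stream = stream

    def __aiter__(self):
        return self._stream.__aiter__()

    async def aclose(self) -> None:
        await self._stream.close()
```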
@r-bit-rry great finds! it looks like we're violating the api contract and using `# type: ignore` to cover it up.
@mattf I suggest using the first option, with direct declaration of the types, and changing the API contract for openai_completion, removing the `# type: ignore` comments.
@r-bit-rry sounds good!
@mattf as I'm digging, I'm finding new stuff. Apparently we have a discrepancy between the official and internal documentation of openai_completion and the implementation in place: the documentation, both on our end and on OpenAI's official end, does not support streaming to begin with (see docs/docs/contributing/new_api_provider.mdx:40-43 and the OpenAI documentation).
mattf
left a comment
can we get rid of the type ignores now?
what about passthrough?
Regarding the passthrough: the issue is just a type mismatch between what the OpenAI SDK returns (Completion/ChatCompletion) and what Llama Stack types expect (OpenAICompletion/OpenAIChatCompletion). Structurally they're the same, but as far as I understand mypy doesn't see them as identical. The streaming path (wrap_async_stream(response)) doesn't have any type ignores and doesn't need one, since wrap_async_stream properly returns AsyncIterator[T]. So removing these would need either explicit type conversions or aligning the llamastack type definitions with the SDK types directly; I think this is out of scope for this fix.
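As an illustration of what dropping the ignores might take on the non-streaming path, a hypothetical cast-based version (OpenAICompletion here is a stand-in for the real Llama Stack type, and passthrough_completion is an invented name):

```python
from typing import Any, cast

from openai import AsyncOpenAI

# Stand-in for llama_stack's OpenAICompletion type; the real definition
# lives in llama-stack and is structurally identical to the SDK's type.
OpenAICompletion = Any


async def passthrough_completion(client: AsyncOpenAI, **params) -> OpenAICompletion:
    # client.completions.create() returns openai.types.Completion. mypy
    # compares nominal types, so assigning it to OpenAICompletion needs
    # either `# type: ignore` or an explicit conversion such as this cast.
    response = await client.completions.create(**params)
    return cast(OpenAICompletion, response)
```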
i see. pls file an issue about that so we can resolve it later.
This PR fixes issue llamastack#3185

The code calls `await event_gen.aclose()` but OpenAI's `AsyncStream` doesn't have an `aclose()` method - it has `close()` (which is async). When clients cancel streaming requests, the server tries to clean up with:

```python
await event_gen.aclose()  # ❌ AsyncStream doesn't have aclose()!
```

But `AsyncStream` has never had a public `aclose()` method. The error message literally tells us:

```
AttributeError: 'AsyncStream' object has no attribute 'aclose'. Did you mean: 'close'?
                                                                               ^^^^^^^^
```

## Verification

* Reproduction script [`reproduce_issue_3185.sh`](https://gist.github.com/r-bit-rry/dea4f8fbb81c446f5db50ea7abd6379b) can be used to verify the fix.
* Manual checks, validation against original OpenAI library code
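A self-contained illustration of the asymmetry, with a plain async generator standing in for the server-side event generator (AsyncStream is the mirror image: close() but no aclose()):

```python
import asyncio


async def event_gen():
    # Stand-in for the server's SSE event generator.
    yield "chunk-1"
    yield "chunk-2"


async def main() -> None:
    gen = event_gen()
    print(await gen.__anext__())   # chunk-1
    await gen.aclose()             # fine: async generators define aclose()
    print(hasattr(gen, "close"))   # False; openai.AsyncStream is reversed:
    # it defines an async close() but no aclose(), hence the AttributeError
    # when the cleanup path assumes aclose() exists.


asyncio.run(main())
```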